Rank Diversity of Languages: Generic Behavior in Computational Linguistics
Similar resources
Rank Diversity of Languages: Generic Behavior in Computational Linguistics
Statistical studies of languages have focused on the rank-frequency distribution of words. Instead, we introduce here a measure of how word ranks change in time and call this distribution rank diversity. We calculate this diversity for books published in six European languages since 1800, and find that it follows a universal lognormal distribution. Based on the mean and standard deviation assoc...
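As a rough illustration of the measure described in this abstract, the sketch below computes a rank diversity value for each rank over a few time slices. The truncated abstract does not give the formula, so the definition used here is an assumption: the number of distinct words that occupy rank k across all time slices, divided by the number of slices. The function names `rank_words` and `rank_diversity` and the toy yearly data are hypothetical and not taken from the paper.

```python
from collections import Counter


def rank_words(tokens):
    """Return the words ordered by descending frequency (ties broken alphabetically)."""
    counts = Counter(tokens)
    return [word for word, _ in sorted(counts.items(), key=lambda kv: (-kv[1], kv[0]))]


def rank_diversity(ranked_lists):
    """Assumed definition: distinct words seen at each rank / number of time slices.

    ranked_lists holds one ranked word list per time slice. Only ranks present
    in every slice are evaluated.
    """
    if not ranked_lists:
        return []
    n_slices = len(ranked_lists)
    max_rank = min(len(ranking) for ranking in ranked_lists)
    diversity = []
    for k in range(max_rank):
        distinct_words = {ranking[k] for ranking in ranked_lists}
        diversity.append(len(distinct_words) / n_slices)
    return diversity


# Toy corpus: one token list per year (invented data for illustration only).
yearly_tokens = [
    "the the the of of cat".split(),
    "the the the of of dog".split(),
    "the the the and and of".split(),
    "the the the of of sea".split(),
]
rankings = [rank_words(tokens) for tokens in yearly_tokens]
print(rank_diversity(rankings))  # [0.25, 0.5, 1.0]
```

In this toy data the top rank is always held by the same word, so its value is the minimum possible (1 divided by the number of slices), while the third rank is held by a different word every year and reaches 1.0.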
Panel: Computational Linguistics Research on Philippine Languages
This paper describes computational linguistic activities on Philippine languages. The Philippines is an archipelago with a vast number of islands and numerous languages. The tasks of understanding, representing and implementing these languages require enormous work. An extensive amount of work has been done on understanding at least some of the major Philippine languages, but little h...
Total Rank Distance And Scaled Total Rank Distance: Two Alternative Metrics In Computational Linguistics
In this paper we propose two metrics for use in various fields of computational linguistics. Our construction is based on the supposition that in most natural languages the most important information is carried by the first part of the unit. We introduce total rank distance and scaled total rank distance, prove that they are metrics, and investigate their max and expected value...
Linguistics in Computational Linguistics: Observations and Predictions
As my title suggests, this position paper focuses on the relevance of linguistics in NLP instead of asking the inverse question. Although the question about the role of computational linguistics in the study of language may theoretically be much more interesting than the selected topic, I feel that my choice is more appropriate for the purpose and context of this workshop. This position paper s...
Computational Linguistics
This book describes the theoretical underpinnings and results of Rosetta, a machine translation (MT) project that started at Philips Research Laboratory in the early 1980's; the book focuses on research carried out between 1985 and 1992. While the project was a collective enterprise among a large number of people (as the pen name indicates), the principal authors were Lisette Appelo, Theo Janss...
Journal
Journal title: PLOS ONE
Year: 2015
ISSN: 1932-6203
DOI: 10.1371/journal.pone.0121898